Abstract 

Mutations in ~100 genes cause muscle diseases with complex and often unexplained genotype/phenotype 
correlations. Next-generation sequencing studies identify a greater-than-expected number of genetic variations in the 
human genome. This suggests that existing clinical monogenic testing systematically misses very relevant information. 
We have created a core panel of genes that cause all known forms of nonsyndromic muscle disorders (MotorPlex). It 
comprises 93 loci, among which are the largest and most complex human genes, such as TTN, RYR1, NEB and DMD. 
MotorPlex captures at least 99.2% of 2,544 exons with very accurate and uniform coverage. This quality is highlighted 
by the discovery of 20-30% more variations in comparison with whole exome sequencing. The coverage homogeneity 
has also made it feasible to apply a cost-effective pooled sequencing strategy while maintaining optimal sensitivity and 
specificity. 

We studied 177 unresolved cases of myopathies for which the best candidate genes had previously been excluded. We 
identified known pathogenic variants in 52 patients and potentially causative ones in a further 56 patients. We also 
discovered 23 patients showing multiple true disease-associated variants, suggesting complex inheritance. Moreover, we 
frequently detected other nonsynonymous variants of unknown significance in the largest muscle genes. Cost-effective 
combinatorial pools of DNA samples were similarly accurate (97-99%). 

MotorPlex is a very robust platform that outperforms the gene-by-gene strategy in power, cost, speed, sensitivity and 
specificity. The applicability of pooling makes this tool affordable for screening the genetic variability of muscle genes 
in larger populations as well. We consider that our strategy can have much broader applications. 
Introduction 

Muscle genetic disorders comprise about 100 different 
genetic conditions [1,2], characterized by clinical, genetic 
and biochemical heterogeneity. The molecular diagnosis 
of myopathic patients is crucial for genetic counseling, 
for prognosis and for available and forthcoming 
mutation-specific treatments [3-5]. In addition, patients who 
share the same mutation may have a different type of muscle 
involvement, with selective involvement of other muscle 
compartments or myocardial damage. Thus, the primary 
defect may or may not be modified by additional and variable 
elements, which may themselves be genetic or not. The most 
severe cases of congenital or childhood-onset myopathies often 
result from mutations in genes encoding proteins belonging 
to common pathways [6]. To guide genetic testing, a muscle 
biopsy is often required; it may be useful, but it is not well 
accepted by patients. Single-gene testing can be diagnostic 
only in patients with the most recognizable disorders. For 
nonspecific cases of muscular disease, however, no effective 
methodology has been developed for the parallel testing of 
all disease genes identified so far [7]. 

Next-generation sequencing (NGS) is changing our 
view of biology and medicine by allowing the large-scale 
calling of small variations in DNA sequences [8]. In the 
last few years, whole-exome sequencing (WES) and 
whole-genome sequencing (WGS) have received widespread 
recognition as universal tests for the discovery of 
novel causes of Mendelian disorders in families [9]. The 
power to discover a novel Mendelian condition increases 
with family size, although successful studies that identified 
novel disease genes from multiple small families with the 
same phenotype have been published [10]. Structural and 
copy number variations are not well detected by NGS 
technologies [11-14]. Moreover, the use of WES/WGS for 
the clinical testing of isolated cases is still debated. First, 
there are ethical issues linked to the management of 
incidental findings [15]. Second, there is the practical 
problem that the coverage is usually too low for clinical 
diagnosis. Cost-effectiveness is therefore reduced, 
considering that WES/WGS may require numerous validation 
procedures, mainly based on conventional PCR and Sanger 
sequencing reactions [16]. Innovative strategies of clinical 
exome sequencing at high coverage have been described [17], 
but the cost for a single patient is still too high for routine 
diagnosis. Thus, there is still room for targeted strategies [18]. 
The HaloPlex Target Enrichment System [19] represents an 
innovative targeting technology, since it uses a combination 
of eight different restriction enzymes followed by probe 
capture. It permits single-tube target amplification, and the 
precise sequence coverage can be accurately predicted in 
advance. We have developed an NGS targeting workflow as a 
single testing methodology for the diagnosis of genetic 
myopathies, which we named MotorPlex. Here we demonstrate 
the high sensitivity and specificity of MotorPlex. We also 
challenged our platform with complex DNA pools. Even with 
this complexity, MotorPlex kept producing reliable data with 
high sensitivity and specificity values. Furthermore, pooling 
reduced the cost of the entire analysis to negligible values, 
enabling applications in large population studies [16,20]. 

Materials and methods 

Patients 

Encrypted DNA samples from patients with a clinical 
diagnosis of nonspecific myopathy, congenital myopathy, 
proximal muscle weakness or limb-girdle muscular 
dystrophy (LGMD) were included. The Italian Networks of 
Congenital Myopathies (coordinated by C.B. and F.M.S.) 
and of LGMD (by F.M. and G.P.C.) were involved, together 
with a large number of other single clinical centers. We 
asked all of them to share further clinical and laboratory 
findings when necessary. We also asked them to provide 
information on familial segregation and previous 
negative genetic tests. Internal patients signed a written 
informed consent, according to the guidelines of Telethon 
Italy and as approved by the Ethics Committee of the 
"Seconda Universita degli Studi di Napoli", Naples, Italy. 



DNA samples were extracted using standard procedures. 
DNA quality and quantity were assessed using both 
spectrophotometric (Nanodrop ND 1000, Thermo Scien- 
tific Inc., Rockford, IL, USA) and fluorometry-based (Qubit 
2.0 Fluorometer, Life Technologies, Carlsbad, CA, USA) 
methods. 

In silico design of MotorPlex 

We included in the design all 93 genes that are 
universally considered genetic causes of nonsyndromic 
myopathies (Additional file 1: Table S1). In particular, we 
selected only genes determining a primary skeletal muscle 
disease, such as those underlying muscular dystrophies, 
congenital myopathies, metabolic myopathies, congenital 
muscular dystrophies, Emery-Dreifuss muscular dystrophy, etc. 
We therefore excluded loci associated with other 
neuromuscular and neurological disorders, such as congenital 
myasthenias, myotonic dystrophy, spinal muscular atrophy, 
ataxias, neuropathies or paraplegias, for which differential 
diagnosis may be clinically possible. For each locus, all 
predicted exons and at least ten flanking nucleotides were 
included in the electronic design by the custom NGS Agilent 
SureDesign webtool. With the sequence length set at 2 x 100 
nucleotides, the predicted target amounted to 2,544 
regions and 493.598 kb. Around 20% of the target is 
represented by TTN coding regions. 

NGS workflow 

For library preparation of single samples, we followed 
the manufacturer's instructions (HaloPlex Target 
Enrichment System For Illumina Sequencing, Protocol version 
D, August 2012, Agilent Technologies, Santa Clara, CA, 
USA). We started from 200 ng of genomic DNA and 
strictly followed the protocol, with the exception that 
restricted fragments were hybridized to the specific probes 
for 16-24 hours. After the capture of biotinylated target 
DNA using streptavidin beads, nicks in 
the circularized fragments were closed by a ligase. Finally, 
the captured target DNA was eluted with NaOH and 
amplified by PCR. Amplified target molecules were purified 
using Agencourt AMPure XP beads (Beckman Coulter 
Genomics, Bernried am Starnberger See, Germany). 

The enriched target DNA in each library sample was 
validated and quantified by microfluidics analysis using 
the Bioanalyzer High Sensitivity DNA Assay kit (Agilent 
Technologies) and the 2100 Bioanalyzer with the 2100 
Expert Software. Usually 20 individual samples were run 
in a single lane (250M reads), generating 100-bp paired 
end reads. 
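
The lane loading described above implies a very high nominal depth per sample. A minimal sketch of that arithmetic follows, assuming that the 250M figure counts individual reads, that reads are split evenly among the 20 samples and that every base falls on the 493.598 kb target; the function name and the `on_target_fraction` parameter are illustrative and not part of the original workflow.

```python
def expected_mean_depth(reads_per_lane: float, samples_per_lane: int,
                        read_length_bp: int, target_size_bp: int,
                        on_target_fraction: float = 1.0) -> float:
    """Idealized mean depth per sample, assuming reads are split evenly."""
    reads_per_sample = reads_per_lane / samples_per_lane
    on_target_bases = reads_per_sample * read_length_bp * on_target_fraction
    return on_target_bases / target_size_bp


# Figures quoted in the text: 250M reads per lane, 20 samples per lane,
# 100-bp reads and a 493.598 kb target (assumed here to be 493,598 bp).
depth = expected_mean_depth(250e6, 20, 100, 493_598)
print(f"idealized mean depth: {depth:.0f}x")
```

Under these assumptions the nominal depth is on the order of 2,500x per sample. Real per-base depth is lower and less even, because of duplicate reads, unequal sample representation and uneven capture.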

For Pool-Seq experiments, equimolar pools of 5 or 16 
DNA samples (detector and scout pools, respectively) were 
created and 200 ng of each pool was used for the HaloPlex 
enrichment strategy. Sixteen detector and five scout pools 
were usually run in a single HiSeq1000 lane. 
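
The pool sizes determine how many chromosomes each pool contains and how small the read fraction of a single heterozygous variant becomes. A minimal sketch, assuming autosomal diploid loci and perfectly equimolar pooling (the function names are illustrative):

```python
def alleles_in_pool(n_samples: int, ploidy: int = 2) -> int:
    """Number of chromosomes represented in a pool of diploid samples."""
    return n_samples * ploidy


def single_het_fraction(n_samples: int, ploidy: int = 2) -> float:
    """Expected read fraction of a variant carried by one heterozygous sample."""
    return 1 / alleles_in_pool(n_samples, ploidy)


for name, size in (("detector", 5), ("scout", 16)):
    alleles = alleles_in_pool(size)
    fraction = single_het_fraction(size)
    print(f"{name} pool: {alleles} alleles, expected fraction {fraction:.4f}")
```

A single heterozygous carrier is therefore expected in about 1 read in 10 in a detector pool and about 1 read in 32 in a scout pool, which is why the smaller pool is used for detection and the larger one for confirmation.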
Targeted sequencing analysis 

The libraries were sequenced using the HiSeq1000 system 
(Illumina Inc., San Diego, CA, USA). The generated 
sequences were analyzed using an in-house pipeline 
designed to automate the analysis workflow, composed of 
modules that perform each step using the appropriate tools, 
either available to the scientific community or developed 
in-house [21]. Paired sequencing reads were aligned to the 
reference genome (UCSC, hg19 build) using BWA [22], 
sorted with Picard (http://picard.sourceforge.net) and 
locally realigned around insertions-deletions with the Genome 
Analysis Toolkit (GATK) [23]. The UnifiedGenotyper 
algorithm of GATK was used for calling SNVs and small 
insertions-deletions (ins-del), with parameters adapted to the 
HaloPlex-generated sequences. The analysis of pools was 
also performed with UnifiedGenotyper, adapting the 
ploidy parameter to the number of chromosomes present 
in the samples (10 for the detector and 32 for the scout 
pools) and the minimal ins-del fraction parameter 
accordingly. The called SNV and ins-del variants produced with 
both platforms were annotated using ANNOVAR [24] 
with: the relative position in genes using the RefSeq [25] gene 
model, the amino acid change, presence in dbSNP v137 [26], 
frequency in the NHLBI Exome Variant Server 
(http://evs.gs.washington.edu/EVS) and the 1000 Genomes 
large-scale project [27], multiple cross-species conservation 
[28,29] and prediction scores of damaging effects on protein 
activity [30-33]. The annotated variants were then imported into 
the internal variation database, which stores all the 
variations found in the re-sequencing projects performed so far 
in our institute. The database was then queried to generate 
the filtered list of variations, and the internal database 
frequency in samples with an unrelated phenotype was used 
as a further annotation and filtering criterion. The alignments 
at candidate positions were visually inspected using the 
Integrative Genomics Viewer (IGV) [34]. We selected from 
the database the non-synonymous SNVs and ins-del with 
a frequency lower than 2%. This was followed by manual 
inspection and further filtering based on the presence 
in unrelated samples of the database, on the presence 
in the other samples of the MotorPlex experiment and on 
the conservation of the mutated positions, leading to a final 
selection of rare, possibly causative, variations per individual. 
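
The automated part of this selection can be sketched as a simple filter. This is only an illustration under stated assumptions: the record fields, the effect labels, the zero-carrier cutoff for the internal database and the example records are all hypothetical, and the in-house pipeline and database schema are not described in enough detail to reproduce. Only the 2% frequency threshold comes from the text.

```python
from dataclasses import dataclass

# Variant classes treated as non-synonymous in this sketch (labels are illustrative).
NON_SYNONYMOUS = {"missense", "stopgain", "stoploss", "frameshift_indel", "inframe_indel"}


@dataclass
class Variant:
    gene: str
    effect: str             # hypothetical annotation label
    public_freq: float      # highest frequency in public databases, 0.0 if absent
    internal_carriers: int  # carriers with an unrelated phenotype in the internal database


def is_candidate(v: Variant, max_freq: float = 0.02,
                 max_internal_carriers: int = 0) -> bool:
    """Keep rare non-synonymous variants not seen in unrelated internal samples."""
    return (
        v.effect in NON_SYNONYMOUS
        and v.public_freq < max_freq
        and v.internal_carriers <= max_internal_carriers
    )


def filter_candidates(variants):
    return [v for v in variants if is_candidate(v)]


# Invented example records, not patient data.
calls = [
    Variant("CAPN3", "missense", 0.0001, 0),
    Variant("TTN", "synonymous", 0.0005, 0),  # dropped: synonymous
    Variant("NEB", "missense", 0.15, 0),      # dropped: too common
    Variant("RYR1", "stopgain", 0.0, 3),      # dropped: recurrent in unrelated samples
]
print([v.gene for v in filter_candidates(calls)])
```

The manual steps described in the text (visual inspection in IGV, conservation, comparison with the other samples of the same run) come after such a filter and are not modeled here.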

Results 

Validation study of MotorPlex 

To design MotorPlex we used a straightforward procedure. 
Briefly, disease genes causing a muscular phenotype, 
including the biggest genes of the human genome, such as titin 
(TTN) and dystrophin (DMD), were selected. The target 
sequences, corresponding to 0.5 Mbp, were enriched by the 
HaloPlex system (see Materials and methods). To validate 
MotorPlex, we created a training set of twenty DNA 
samples belonging to patients (15 males and 5 females) affected 
by different forms of limb-girdle muscular dystrophy or 
congenital myopathy (Additional file 2: Table S2) and 
compared the results with data from whole exome sequencing 
(WES) (Figure 1). For each sample, about 98% of the reads 
generated (Figure 1a and Additional file 3: Table S4) were 
on target (compared to 88% obtained by WES) and 
fewer than 0.5% of targeted regions were not covered 
(about 15% of human exons are not analyzed by WES, 
Additional file 4: Figure S1). Moreover, more than 95% 
of targeted nucleotides were read at a 100x depth and 
a 500x depth was obtained for 80% of these; in 
contrast, with WES fewer than 
70% of exons were covered at 20x (Figure 1b). From 
previous amplicon Sanger sequencing of these 
samples, we knew of 84 variants in 17 different 
genes (Additional file 5: Table S3). All these known 
variants were correctly called and no additional change 
was seen within the sequenced target (100% sensitivity 
and specificity). Moreover, to assess the reproducibility 
of the targeted enrichment and the subsequent NGS 
workflow, the same sample (43U) was analyzed twice. 
After filtering, variants were always confirmed, 
including the putative causative one (Table 1). Outside the 
Sanger coverage, 4,991 additional variations were called 
(Additional file 6: Table S5). 

Validation study of double-check pooling 

To test whether MotorPlex can be applied to large studies of 
thousands of patients and/or used to detect mosaic 
mutations, we designed a combinatorial pooling strategy. 
After some initial attempts with pools of identical sizes, 
we changed our strategy. The general arrangement was 
to have the same sample in two different independent 
pools, composed of two exclusive combinations of 
samples (Figure 2). This permitted us to identify both the 
rare variations and the mutated sample. In particular, 
the pools were organized in two groups: the "detector 
pool", containing only five samples (10 alleles), which had 
the purpose of detecting variations with optimal 
sensitivity, and the "scout pool", composed of 16 samples 
(32 alleles), which confirmed the variation(s) and attributed 
them unambiguously to distinct DNA samples (Additional file 7: 
Figure S2; Additional file 8: Table S6). We took care 
each time to include only the index cases, excluding 
related family members. 
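
The double-check idea can be illustrated with a grid layout: 80 samples, 16 detector pools of 5 and 5 scout pools of 16, arranged so that no two samples share both pools. The grid assignment below is an assumption made for illustration; the actual pool composition is given in Additional file 8: Table S6 and is not reproduced here.

```python
from collections import defaultdict


def build_grid_pools(n_detector: int = 16, n_scout: int = 5):
    """Assign each sample to one detector pool and one scout pool (grid layout)."""
    assignment = {}
    for d in range(n_detector):
        for s in range(n_scout):
            assignment[d * n_scout + s] = (d, s)
    return assignment


def pool_members(assignment):
    """Return the sample lists of every detector pool and every scout pool."""
    detector, scout = defaultdict(list), defaultdict(list)
    for sample, (d, s) in assignment.items():
        detector[d].append(sample)
        scout[s].append(sample)
    return detector, scout


def candidate_carriers(assignment, detector_hits, scout_hits):
    """Samples whose detector pool AND scout pool both show the variant."""
    return sorted(
        sample for sample, (d, s) in assignment.items()
        if d in detector_hits and s in scout_hits
    )


assignment = build_grid_pools()
detector, scout = pool_members(assignment)
# A variant seen only in detector pool 8 and scout pool 2 points to one sample.
print(candidate_carriers(assignment, {8}, {2}))
```

With a single carrier the pair of positive pools identifies the sample directly. If the same rare variant is carried by two samples in different detector and scout pools, the intersection yields four candidates, and individual confirmation (for example by Sanger sequencing) is needed to resolve them.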

To validate this arrangement, we selected five samples 
that we had previously sequenced individually, in which 1,235 
variations had been called. We pooled them in the same 
detector pool (P9) and then reanalyzed them in different scout 
pools. Remarkably, in pool P9 we called 1,232/1,235 variations 
belonging to the individual samples, giving a sensitivity 
of 99.8%. The three missing variations (an insertion 
in RRM2B and two point variants in TTN) were located 
in regions with lower coverage. In contrast, no 
Figure 1 A comparison between MotorPlex and a Whole Exome strategy (WES) demonstrates the better performance of the targeted 
strategy. (a) 97.75% of reads generated in a MotorPlex experiment fall in the regions of interest and only 0.67% of targeted regions are not 
sequenced. In contrast, for WES 88.66% of reads are on target and 14.89% of targeted exons are not effectively covered. (b) The percentage 
of targeted regions covered at high depth by MotorPlex is higher than that obtained by WES. In particular, 96.01% and 81.6% of regions are, 
respectively, covered at 100x and 200x by using MotorPlex versus 35.49% and 1.90% by WES. 



variation was called in pool P9 in addition to those of the 
individual samples, demonstrating the absence of false 
positives and artefacts due to the pooling strategy. 
Two further samples from the training set were 
included in two other detector pools, with similar 
results. 

We then confirmed 223/230 (97%) of the variations tested by 
Sanger sequencing, thus providing the specificity value of 
the method. Moreover, the combined use of detector 
pools and scout pools allowed us to "clean" the results. 
In fact, 50% of off-target variations (n=1,291) were not 
called in the scout pools and were easily filtered out during 
bioinformatic analysis. In addition, about 25% of variants 
in poorly covered regions (<500 total reads), a large 
percentage of which are false positive calls, were similarly 
filtered out because they were not detected in the scout 
pools (Additional file 9: Figure S3). 
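
The two accuracy figures quoted for the pooled design follow directly from the counts in the text. A minimal sketch of the arithmetic (the helper name is illustrative): the first value is the fraction of individually called variations recovered in pool P9, the second is the fraction of tested calls that Sanger sequencing confirmed, which the text reports as the specificity value of the method.

```python
def fraction(hits: int, total: int) -> float:
    """Plain proportion with a guard against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


pool_sensitivity = fraction(1232, 1235)   # variations recovered in pool P9
sanger_confirmation = fraction(223, 230)  # pooled calls confirmed by Sanger

print(f"sensitivity in pool P9: {pool_sensitivity:.1%}")
print(f"Sanger-confirmed calls: {sanger_confirmation:.1%}")
```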

Variants and interpretation 

The targeted analysis of 93 genes showed a total of 
23,109 rare variants (<0.01 frequency) in 173 patients 
(1.4 variants/gene/patient). To provide a preliminary 
interpretation in relation to the clinical suspicion, 
we set bioinformatic filters that weigh the variant class 
(missense, indel, stopgain or stoploss), the calculated 
frequency in public and internal databases and the 
annotation as causative variants. Finally, we critically 
reconsidered the correspondence with the clinical 
presentation, the age at onset and the segregation in 
familial cases. 
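
The rate of 1.4 variants/gene/patient is the total count divided by the number of gene-patient combinations. A short worked check using only the figures given above (variable names are illustrative):

```python
rare_variants = 23_109  # rare variants (<0.01 frequency) reported in the text
genes = 93              # genes in the MotorPlex panel
patients = 173          # patients counted for this figure in the text

per_gene_per_patient = rare_variants / (genes * patients)
per_patient = rare_variants / patients

print(f"{per_gene_per_patient:.2f} rare variants per gene per patient")
print(f"{per_patient:.0f} rare variants per patient across the panel")
```

Each patient thus carries well over a hundred rare variants in the panel, which is why the class, frequency and annotation filters described above are needed before any clinical interpretation.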

In detail, we identified 52 patients (52/177=29%) with 
variants of likely pathogenicity or predicted to affect 
function (Table 1 and Additional file 10): most of them 
(38/52=73%) had known or truncating variants (indel, 
stopgain or stoploss). Five patients (5/52=9.6%) showed 
a novel variant in addition to a pathogenic allele in a 
recessive gene. The remaining samples (9/52=17%) had 
novel variants that are predicted to affect function in 
genes fitting the clinical suspicion. 

In a further 56 samples (56/177=32%), we identified 
potentially causative variants (Table 2 and Additional file 10). 
In these cases, there was only a partial correspondence 
with the clinical phenotype. For example, a number of 
variants had previously been associated with 
cardiomyopathy, but their pathogenic role in congenital myopathy or 



Table 1 List of pathogenic variants 

Sample ID | Sex | Clinical diagnosis | Inheritance | Histopathologic features | Gene | Position | cDNA change | Protein change | Zygosity | Reported phenotype (ref.)
Single1 | M | CM | Sp | c.n. | DNM2 | chr19:10934538* | c.1856C>G | p.S619W | het | c.n. (ref. 1)
Single3 | M | LGMD | Sp | m.f. | CAPN3 | chr15:42695076* | c.1621C>T | p.R541W | het | LGMD (ref. 2)
 |  |  |  |  | CAPN3 | chr15:42682142* | c.802-9G>A | spl. | het | LGMD (ref. 3)
Single6 | M | LGMD | Rec | m.f. | FKRP | chr19:47259458 | c.751G>T | p.A251S | het | 
 |  |  |  |  | FKRP | chr19:47259758 | c.G1051C | p.A351P | het | 
Single8 | M | LGMD | Sp | n.a. | DYSF | chr2:71838708 | c.4119C>A | p.N1373K | het | 
 |  |  |  |  | DYSF | chr2:71762413 | c.1369G>A | p.E457K | het | 
Single15 | F | LGMD/CM | Sp | d.f. | SYNE2 | chr14:64688329 | c.663G>A | p.W221X | het | 
Single16 | M | LGMD/DCM | Sp | d.f. | SGCG | chr13:23869573* | c.525delT | p.F175L fsX20 | hom | LGMD (ref. 4)
 |  |  |  |  | LDB3 | chr10:88446830* | c.349G>A | p.D117N | het | DCM (ref. 5)
Single19 | M | LGMD | Sp | m.f. | RYR1 | chr19:39062797* | c.13885G>A | p.V4629M | het | CM (ref. 6)
Single20 | M | LGMD/DCM | Rec | c.n. | RYR1 | chr19:39009932* | c.10097G>A | p.R3366H | het | Multiminicore (ref. 7)
 |  |  |  |  | RYR1 | chr19:38973933* | c.4711A>G | p.M 571V | het | MH (ref. 8)
 |  |  |  |  | RYR1 | chr19:39034191* | c.11798A>G | p.Y3933C | het | MH (ref. 9)
 |  |  |  |  | RYR1 | chr19:38942453 | c.G1172C | p.R391P | het | 
 |  |  |  |  | DES | chr2:220284876* | c.638C>T | p.A213V | het | DCM (ref. 10)
1/17s | F | CM | Sp | c.n. | TTN | chr2:179452695* | c.63439G>A | p.A21157T | het | ARVD (ref. 11)
 |  |  |  |  | TTN | chr2:179496025 | c.G43750T | p.G14584X | het | 
 |  |  |  |  | TTN | chr2:179392277* | c.107576T>C | p.M35859T | het | ARVD (ref. 11)
1/21s | M | LGMD | n.a. | n.a. | SGCA | chr17:48246607* | c.739G>A | p.247V>M | het | LGMD (ref. 12)
 |  |  |  |  | SGCA | chr17:48245758* | c.409G>A | p.E137K | het | LGMD (ref. 13)
2/17s | F | CM | Sp | cftdm | MYH7 | chr14:23886406 | c.T4475C | p.L1492P | het | 
2/20s | M | LGMD | n.a. | n.a. | POMT2 | chr14:77745129* | c.1975C>T | p.659R>W | het | CMD (ref. 14)
 |  |  |  |  | POMT2 | chr14:77769283* | c.551C>T | p.T184M | het | LGMD (ref. 15)
3/20s | F | LGMD | Sp | cftdm | TPM2 | chr9:35689792* | c.20_22delAGA | p.7Kdel | het | CM (ref. 16)
4/17s | M | LGMD | Rec | c.n. | ANO5 | chr11:22242646* | c.191dupA | p.64N>Kfs*15 | hom | LGMD (ref. 17)
4/18s | M | LGMD | Sp | vacuoles | DNAJB6 | chr7:157175006 | c.413G>A | p.G138E | het | 
5/17s | M | LGMD/DCM | Sp | m.f. | MYOT | chr5:137213267 | c.591delTG | p.199F>S fsX3 | het | 
5/21s | M | LGMD | Sp | c.n. | CAV3 | chr3:8787288* | c.191C>G | p.T64S | het | HCM (ref. 18)
6/20s | M | LGMD | Sp | d.f. | ACADVL | chr17:7127330* | c.G1376A | p.R459Q | het | VLCAD (ref. 19)
 |  |  |  |  | ACADVL | chr17:7128130 | c.C754T | p.A585V | het | 
7/17s | M | LGMD | Sp | m.f. | CAPN3 | chr15:42702843* | c.2242C>T | p.R748X | het | LGMD (ref. 20)
Table 1 List of pathogenic variants (Continued) 

(Per-column lists in source order.)
Sample ID: 7/20s, 8/19s, 10/17s, 10/21s, 11/18s, 12/18s, 12/21s, 13/20s, 13/21s, 14/20s, 14/18s, 15/19s, 16/18s, 16/20s, 16/21s, 23/38s, 23/41s, 24/42s, 25/38s, 25/39s, 25/41s, 28/39s, 28/41s, 29/41s
Sex: F, M, F, M, M, F, F, M, F, M, M, M, M, F, M, M, F, M, F, F, F, M, F
Clinical diagnosis: LGMD, LGMD, CM, LGMD/FSHD, CM, CM, LGMD, LGMD, LGMD, LGMD, LGMD, CM, LGMD, CM, CM, CM, CM, CM, CM, CM, CM, CM, CM, CM
Inheritance: Sp, n.a., Sp, Dom, Sp, Sp, Sp, Rec, Sp, n.a., n.a., Sp, Sp, Sp, Dom, Sp, Sp, n.a., Sp, Dom, n.a., Dom, Sp, Rec
Histopathologic features: d.f., d.f., m.f., d.f., nemaline, cftdm, d.f., d.f., n.a., d.f., multiminicores, no alterations, cftdm, n.a., cftdm, m.f., n.a., cftdm, c.n., n.a., minicore, c.n., n.a.

Gene | Position | cDNA change | Protein change | Zygosity | Reported phenotype (ref.)
CAPN3 | chr15:42693952* | c.1468C>T | p.R490W | het | LGMD (ref. 21)
LMNA | chr1:156100408* | c.357C>T | p.R119R (spl.) | het | EDMD (ref. 22)
DNAJB6 | chr7:157155959 | c.C170T | p.S57L | het | 
MYH7 | chr14:23886518 | c.G4363T | p.E1455X | het | 
SMCHD1 | chr18:2700849* | c.C1580T | p.T527M | het | FSHD (ref. 23)
NEB | chr2:152447860 | c.6915+2T>C | spl. | het | 
NEB | chr2:152553662 | c.C1470T | p.D490D (spl.?) | het | 
MYH7 | chr14:23882063 | c.G5808C | p.X1936Y | het | 
PYGM | chr11:64519958 | c.A1537G | p.I513V | het | 
PYGM | chr11:64514809* | c.C2199G | p.Y733X | het | McArdle (ref. 24)
LAMA2 | chr6:129722399* | c.C5476T | p.R1826X | het | LGMD (ref. 25)
LAMA2 | chr6:129571264 | c.1791_1793delAGT | p.598delV | het | 
SGCG | chr13:23898652* | c.848G>A | p.C283Y | hom | LGMD (ref. 26)
CAPN3 | chr15:42686485* | c.1061T>G | p.V354G | het | LGMD (ref. 21)
CAPN3 | chr15:42689077 | c.1193+2T>C | spl. | het | 
DMD | chrX:32360366* | c.G5773T | p.E1925X | hem | Duchenne (ref. 27)
MYH7 | chr14:23885313* | c.4850_4852del | p.1617delK | het | Distal (ref. 28)
CAPN3 | chr15:42691746* | c.1250C>T | p.T417M | hom | LGMD (ref. 29)
TTN | chr2:179431175 | c.C79684T | p.R26562X | het | 
TTN | chr2:179526510 | c.A39019T | p.K13007X | het | 
TPM2 | chr9:35685541* | c.A382G | p.K128E | het | CFTD (ref. 30)
RYR1 | chr19:38959672 | c.3449delG | p.C1150fs | het | 
RYR1 | chr19:38985186 | c.6469G>A | p.E2157K | het | 
RYR1 | chr19:39003108* | c.9457G>A | p.G3153R | het | MH (ref. 31)
RYR1 | chr19:38990637* | c.G7304T | p.R2435L | hom | CCD (ref. 32)
ACTA1 | chr1:229567867* | c.G682C | p.E228Q | het | Nemaline (ref. 33)
CRYAB | chr11:111779520 | c.A496T | p.K166X | het | 
RYR1 | chr19:39075614* | c.14678G>A | p.R4893Q | het | CCD (ref. 34)
MYH7 | chr14:23886750 | c.G4315C | p.A1439P | het | 
MYH7 | chr14:23885313* | c.4850_4852del | p.1617delK | het | Distal (ref. 28)
MTM1 | chrX:149831996* | c.C1558T | p.R520X | hem | Myotubular (ref. 35)
NEB | chr2:152387617 | c.21628-2A>T | spl. | het | 
Table 1 List of pathogenic variants (Continued) 

Sample ID | Sex | Clinical diagnosis | Inheritance | Histopathologic features | Gene | Position | cDNA change | Protein change | Zygosity | Reported phenotype (ref.)
 |  |  |  |  | NEB | chr2:152541300 | c.C2827T | p.Q943X | het | 
30/42s | F | CM | Rec | cftdm | RYR1 | chr19:38948185* | c.C1840T | p.R614C | het | MH (ref. 36)
 |  |  |  |  | RYR1 | chr19:38959747 | c.G3523A | p.E1175K | het | 
31/42s | F | CM | Rec | nemaline | NEB | chr2:152471093 | c.11298_11300delTAC | p.Y3766del | hom | 
32/41s | M | CM | Dom | c.n. | MTM1 | chrX:149826390 | c.1150C>T | p.Q384X | het | 
32/42s | F | CM | Dom | minicore | DNM2 | chr19:10939917 | c.C2252A | p.T751N | het | 
33/41s | M | CM | Rec | nemaline | NEB | chr2:152370944 | c.23122-2A>G | spl. | het | 
 |  |  |  |  | NEB | chr2:152544037 | c.A2533G | p.K845E | het | 
36/42s | M | CM | Dom | n.a. | RYR1 | chr19:39075629* | c.T14693C | p.I4898T | het | CCD (ref. 37)
37/39s | M | LGMD | Sp | d.f. | DMD | chrX:32841417* | c.T328C | p.W110R | hem | Becker (ref. 38)
37/40s | F | LGMD | Sp | n.a. | SYNE2 | chr14:64676751* | c.C18632T | p.T6211M | het | EDMD (ref. 39)
37/41s | F | CM | Dom | m.f. | MTM1 | chrX:149826390 | c.1150C>T | p.Q384X | het | 

*Already reported. For references, see Additional file 10. 
Figure 2 NGS targeting workflow. Ninety-three disease genes causing a muscular phenotype were selected. To cover all their exons and the ten 
flanking bases, an enrichment strategy based on the HaloPlex system was designed. DNA samples of 80 patients were analyzed twice in an independent 
manner, using a combinatorial pooling scheme. As required by the HaloPlex protocol, DNA samples were digested, barcoded and amplified. The 80 samples 
were run at the same time in a single lane of the flow cell of the HiSeq 1000. The subsequent data analysis allowed us to detect putative causative variants, 
which were validated by Sanger sequencing. 



in LGMDs has not yet been established. This group 
includes patients having two rare variants in the TTN gene 
or at least one variant in the COL6A1, COL6A2, COL6A3, 
SYNE1, SYNE2 and FLNC genes. The molecular 
findings in these 56 samples were not considered 
strictly disease-causing, and further tests are required. 

The most surprising finding was, however, the 
presence of additional damaging or potentially damaging 
variants in 23 patients of the first two groups 
(23/108=21%), in whom other pathogenic variants or 
variants of uncertain significance had already been 
identified. These variants, if they had been detected alone in 
the context of single-gene testing, would have been 
considered causative. 

The third group includes 26 patients (26/177=15%) in 
whom we discovered a single truncating variant (or a 
known disease-associated variant) in a recessive gene 
that is compatible with the phenotype. The second allele 
may carry an RNA splicing defect that is generally not 
predictable by DNA sequencing, or a variation in 
promoters or regulatory regions that were not investigated. 



Discussion 

In the last decade, remarkable progress has been made in 
discovering new disease genes and differentiating similar 
muscle disorders [1,2]. This growing genetic heterogeneity 
highlights the problem of a very complex diagnosis [35]. 
Furthermore, genome sequencing studies suggest that the 
clinical genetic test may be incomplete not only when the 
causative mutation is missing, but also when the 
genotype/phenotype correlation appears weak. This is particularly 
true when the familial recurrence is unclear, with some 
relatives who share only minor signs. In families with 
patients who are more severely affected, this "grey area" is 
problematic for both genetic counselling and forthcoming 
mutation-specific treatments. However, this represents a 
fitting challenge for the new genomic, high-throughput 
technologies: the power of discovery has been dramatically 
boosted by the introduction of next-generation 
sequencing (NGS) techniques [13,36-38]. In the NGS era, 
genetic testing is going to move from a few candidate genes 
to broader panels of genes [39] or, ultimately, to the entire 
genome. This will have consequences on the diagnostic 
Table 2 Variants of unknown significance (VOUS) 

Sample ID | Sex | Clinical diagnosis | Inheritance | Histopathologic features | Gene | Position | cDNA change | Protein change | Zygosity | Reported phenotype (ref.)
Single7 | M | LGMD/EDMD | Rec | d.f. | NEB | chr2:152468776 | c.A11729G | p.D3910G | het | 
 |  |  |  |  | NEB | chr2:152495898 | c.C8890T (spl.) | p.R2964C | het | 
 |  |  |  |  | COL6A2 | chr21:47552071 | c.2665C>T | p.Q889X | het | 
Single9 | M | LGMD | n.a. | m.f. | RYR1 | chr19:38986923* | c.6617C>T | p.T2206M | het | MH (ref. 40)
Single13 | M | CM | Sp | n.a. | LAMA2 | chr6:129687396* | c.4750G>A | p.G1584S | het | LGMD (ref. 41)
 |  |  |  |  | LAMA2 | chr6:129775423 | c.6697G>A | p.V2233I | het | 
 |  |  |  |  | NEB | chr2:152506812 | c.C7309T | p.R2437W | het | 
 |  |  |  |  | NEB | chr2:152512781 | c.T6381A | p.D2127E | het | 
Single14 | F | LGMD | Sp | d.f. | COL6A3 | chr2:238249316 | c.C8243T | p.P2748L | het | 
 |  |  |  |  | COL6A3 | chr2:238289767 | c.A1688G | p.D563G | het | 
Single18 | M | CM | n.a. | n.a. | HSPG2 | chr1:22176684 | c.7296A>T | spl. | het | 
 |  |  |  |  | HSPG2 | chr1:22200473 | c.3688G>A | p.G1230S | het | 
1/18s | M | CM | Sp | c.n. | RYR1 | chr19:38990340 | c.G7093A | p.G2365R | het | 
 |  |  |  |  | RYR1 | chr19:39018347* | c.G10747C | p.E3583Q | het | MH (ref. 42)
2/19s | M | LGMD/DCM | Sp | d.f. | NEB | chr2:152404851 | c.G20128A | p.V6710I | het | 
 |  |  |  |  | NEB | chr2:152534216 | C.Lioi/ | p. I I z I 3IVI | het | 
3/17s | F | LGMD | Sp | cftdm | SYNE2 | chr14:64407373 | c.A121G | p.I41V | het | 
4/21s | M | LGMD | Sp | d.f. | MYH7 | chr14:23882979* | c.A5779T | p.I1927F | het | HCM
 |  |  |  |  | FLNC | chr7:128487762 | c.C4300T | p.R1434C | het | 
5/18s | M | LGMD | n.a. | n.a. | TTN | chr2:179393000 | c.107377+1G>A | spl. | het | 
 |  |  |  |  | TTN | chr2:179441932 | c.C69130T | p.P23044S | het | 
5/19s | F | CM | n.a. | n.a. | TTN | chr2:179439491 | c.C71368T | p.R23790C | het | 
 |  |  |  |  | TTN | chr2:179596569 | c.G17033A | p.R5678Q | het | 
5/20s | M | LGMD | Sp | d.f. | COL6A3 | chr2:238283289* | c.C3445T | p.R1149W | het | AVSD (ref. 44)
 |  |  |  |  | COL6A3 | chr2:238296516 | c.C1021T | p.R341C | het | 
 |  |  |  |  | NEB | chr2:152476125 | c.G10712G | p.R3571P | het | 
 |  |  |  |  | NEB | chr2:152580847 |  | p.t\ I OUn | het | 
6/21s | M | CM | Dom | cftdm | SYNE1 | chr6:152776709 | c.C2744T | p.T915I | het | 
 |  |  |  |  | SYNE2 | chr14:64468677 | c.C3664T | p.R1222W | het | 
7/19s | M | CM | Sp | cftdm | COL6A3 | chr2:238287746* | c.G2030A | p.R677H | het | Bethlem
7/21s | M | LGMD | Sp | normal | TTN | chr2:179500777 | c.G41521A | p.D13841N | het | 
 |  |  |  |  | TTN | chr2:179615278 | c. 1 1 1 84yL | p.bybU I | het | 
8/20s | F | LGMD | Sp | d.f. | COL6A3 | chr2:238253701 | c.C7162T | p.P2388S (spl.) | het | 
8/21s | M | LGMD | Sp | d.f. | SMCHD1 | chr18:2740713 | c.C3527T | p.T1176I | het | 
10/18s | F | LGMD | n.a. | n.a. | RYR1 | chr19:39034191* | c.A11798G | p.Y3933C | het | MH (ref. 9)
10/19s | F | LGMD | Sp | d.f. | RYR1 | chr19:38990359* | c.A7112G | p.E2371G | het | MH (ref. 31)
10/21s | M | LGMD | Sp | d.f. | SMCHD1 | chr18:2700849 | c.C1580T | p.T527M | het | 
11/17s | M | LGMD | Sp | T1FP | FHL1 | chrX:135278980 | c.T19C | p.S7P | het | 
11/19s | M | LGMD | Dom | m.f. | MYH2 | chr17:10446451 | c.A769G | p.T257A | het | 
11/20s | M | LGMD | Sp | normal | FLNC | chr7:128482964 | c.C2506T | p.P836S | het | 
12/19s | M | LGMD | Sp | d.f. | COL6A2 | chr21:47545454 | c.T1892C | p.F631S | het | 
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Table 2 Variants of unknown significance (Vous) (Continued) 



| Patient | Sex | Diagnosis | Inheritance | Biopsy | Gene | Position | cDNA change | Protein change | Status | Previously reported |
|---|---|---|---|---|---|---|---|---|---|---|
| 13/18s | M | CM | Sp | cftdm and multiminicore | MYBPC3 | chr11:47356715* | c.C2783T | p.S928L | het | HCM sr46 |
| | | | | | SYNE2 | chr14:64447727 | c.A1672C | p.K558Q | het | |
| 14/21s | M | LGMD | Sp | d.f. | RYR1 | chr19:39076763 | c.C14901G | p.D4967E | het | |
| | | | | | RYR1 | chr19:39076777 | c.C14915T | p.T4972I | het | |
| 15/20s | M | LGMD | Sp | normal | LDB3 | chr10:88492723 | c.T2174A | p.I725N | het | |
| 15/21s | F | CM | Sp | central core | PHKA1 | chrX:71840734 | c.G1978A | p.V660I | het | |
| | | | | | SYNE1 | chr6:152746618 | c.C5165T | p.S1722L | het | |
| | | | | | SYNE2 | chr14:64548224 | c.A11410G | p.T3804A | het | |
| 23/40s | M | CM | n.a. | c.n. | TMEM43 | chr3:14175304 | c.C578T | p.S193L | het | |
| | | | | | MYBPC3 | chr11:47364189* | c.G1564A | p.A522T | het | HCM sr47 |
| 24/38s | M | CM | Sp | cftdm | TTN | chr2:179559591 | c.G31313A | p.R10438Q | het | |
| | | | | | TTN | chr2:179586762 | c.C22628T | p.P7543L | het | |
| | | | | | FLNC | chr7:128475627 | c.C600T | p.P200P spl. | het | |
| 24/39s | M | CM | n.a. | n.a. | FLNC | chr7:128492888 | c.C6011T | p.S2004F | het | |
| 24/41s | F | CM | n.a. | n.a. | TTN | chr2:179495045 | c.A44204G | p.N14735S | het | |
| | | | | | TTN | chr2:179586756 | c.G22634A | p.R7545Q | het | |
| 25/40s | M | CM | Sp | nemaline | FLNC | chr7:128494538 | c.G6799A | p.V2267I | het | |
| 25/42s | M | CM | n.a. | cftdm | RYR1 | chr19:38986890 | c.C6584T | p.P2195L | het | |
| [illegible] | M | [illegible] | Sp | core myopathy | TTN | [illegible] | [illegible] | [illegible] | het | |
| | | | | | TTN | chr2:179614124 | c.A13003G | p.R4335G | het | |
| 26/41s | M | CM | n.a. | n.a. | DYSF | chr2:71740851* | c.G463A | p.G155R | het | LGMD sr4E |
| | | | | | DYSF | chr2:71827853 | c.C3724T | p.R1242C | het | |
| 26/42s | M | CM | n.a. | core myopathy | TTN | chr2:179522230 | c.T38033C | p.V12678A | het | |
| | | | | | TTN | chr2:179527095 | c.C37009T | p.P12337S | het | |
| 27/39s | M | CM | Sp | cftdm | COL6A1 | chr21:47406897 | c.C628G | p.R210G | het | |
| 27/41s | F | CM | n.a. | cftdm | SYNE1 | chr6:152746682 | c.G5101T | p.A1701S (spl.) | het | |
| | | | | | SYNE2 | chr14:64484328 | c.G4903A | p.E1635K | het | |
| 27/42s | F | CM | n.a. | multiminicores | COL6A1 | chr21:47406559 | c.G548A | p.G183D | het | |
| | | | | | MYH7 | chr14:23885359 | c.G4807C | p.A1603P | het | |
| | | | | | DNM2 | chr19:10909210 | c.A1384G | p.T462A | het | |
| 28/40s | M | CM | n.a. | n.a. | TTN | chr2:179415978 | c.G91280T | p.G30427V | het | |
| | | | | | TTN | chr2:179415952 | c.C91306T | p.R30436W | het | |
| 28/41s | M | CM | Sp | d.f. | COL6A1 | chr21:47410893 | c.G1057A | p.G353S | het | |
| 29/38s | M | LGMD | Rec | d.f. | COL6A2 | chr21:47539756 | c.G1324T | p.G442W | het | |
| | | | | | COL6A2 | chr21:47551934* | c.G2528A | p.R843Q | het | AVSD sr44 |
| 30/38s | F | CM | Sp | n.a. | TTN | chr2:179411904 | c.C94348T | p.R31450C | het | |
| | | | | | TTN | chr2:179428049 | c.G82814A | p.G27604S | het | |
| 31/39s | M | CM | Sp | minicores | ATP7A | chrX:77301920 | c.G4356C | p.L1452F | het | |
| 31/40s | F | CM | Sp | cftdm | PHKA1 | chrX:71840734 | c.G1978A | p.V660I | het | |
| 31/41s | M | CM | Sp | reducing body | KBTBD13 | chr15:65369638 | c.C485T | p.T162M | het | |
| 32/40s | M | CM | Sp | T1FP | TTN | chr2:179583104 | c.C24729A | p.C8243X | het | |
| | | | | | TTN | chr2:179589034 | c.A21068C | p.Q7023P | het | |
| 33/38s | F | LGMD | Sp | d.f. | CNTN1 | chr12:41337835 | c.A1546G | p.I516V | het | |

Table 2 Variants of unknown significance (VoUS) (Continued)





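| Patient | Sex | Diagnosis | Inheritance | Biopsy | Gene | Position | cDNA change | Protein change | Status | Previously reported |
|---|---|---|---|---|---|---|---|---|---|---|
| [illegible] | [illegible] | LGMD | Sp | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | het | |
| 34/41s | M | CM | n.a. | m.f. | COL6A2 | chr21:47545473 | c.C1911G | p.F637L | het | |
| 35/41s | M | CM | n.a. | c.n. | DYSF | chr2:71730384 | c.G277A | p.A93T | hom | |
| | | | | | TTN | chr2:179411050 | c.C95008T | p.R31670X | het | |
| 36/38s | M | LGMD | Sp | d.f. | SYNE1 | chr6:152651958 | c.C15746T | p.T5249M | het | |
| 36/39s | F | CM | Sp | cftdm | COL6A2 | chr21:47545885 | c.G2156A | p.R719Q | het | |
| | | | | | CPT1B | chr22:51012938 | c.G767A | p.R256H | het | |
| 36/40s | M | LGMD and DCM | Sp | m.f. | SYNE2 | chr14:64447788 | c.A1733G | p.K578R | het | |
| 37/38s | M | LGMD | Sp | m.f. | COL6A3 | chr2:238277282 | c.A4824T | p.R1608S | het | |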



* Already reported. For references, see Additional file 10. 



flowchart: NGS tests may represent the first tier test, 
preceding biopsy and other invasive procedures. 

We have applied both WES and targeted approaches to the diagnosis of genetic muscle disorders, collecting DNA samples from undiagnosed patients, and realized that NGS technology can be helpful for clinical diagnostics, provided that a suitable tool is created. We outlined its ideal profile. Such a tool should fulfil the following requirements [16,20]: 1) be cost-effective and thus applicable to a large number of patients and normal individuals; 2) be robust in terms of target reproducibility; 3) be specific and sensitive, with a limited need for further validation steps; 4) be large enough to include all relevant genes; and, finally, 5) be easily upgradable in view of novel discoveries. Here we demonstrate that this complex targeting can be generated and that all these requirements can be fulfilled. We chose Haloplex as the enrichment technology. Haloplex first digests DNA using eight different combinations of endonucleases. In our experience, this approach is more reproducible and accurate than random mechanical DNA fragmentation. In addition, the capture is independent of the target base composition and is predictable from the probe design phase. As proof of specificity and efficiency, we show that less than 2% of the reads generated by MotorPlex are off-target, compared with >12% for WES. This further improves the cost-effectiveness of the approach. This platform, based on eight different digestions and hybridization, is more accurate, reproducible and sensitive than other published methods [34]. We designed MotorPlex to detect variations in 93 muscle-disease genes and assayed 177 pre-screened DNA samples from myopathic patients. It is important to note that no mutation had so far been detected in any of these patients, even though most of them had been studied at length using a gene-by-gene sequencing approach. The high coverage and depth obtained allowed us to detect variations in most genes with a sensitivity comparable to that of Sanger sequencing. According to our conservative interpretation of the NGS data, the diagnosis is complete in 52 patients (29%). However, the detection rate will grow after further molecular characterization of the putative pathogenic variations in a second group of 56 patients. In addition, 26 samples (15%) have a defect in a single allele associated with a recessive condition. We predict that most of these carry an elusive hit on the other allele, such as a splicing defect or copy number mutation(s). A figure of 15% is, in fact, typical for disease-causing variants not detectable by sequencing.
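
The proportions quoted in this paragraph can be reproduced from the raw counts. The following is a minimal Python sketch; the dictionary labels and the helper name `percentage` are illustrative choices of ours, and only the counts (52, 56, 26 out of 177) come from the text.

```python
# Counts taken from the text: 177 pre-screened patients in total.
TOTAL_PATIENTS = 177

counts = {
    "complete diagnosis": 52,
    "putative pathogenic variants": 56,
    "single allele of a recessive condition": 26,
}


def percentage(count, total=TOTAL_PATIENTS):
    """Return count as a percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


# Percentage of the cohort falling in each group.
yields = {label: percentage(n) for label, n in counts.items()}

for label, pct in yields.items():
    print(f"{label}: {counts[label]}/{TOTAL_PATIENTS} = {pct}%")
```

The rounded values for the first and third groups agree with the 29% and 15% given above.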

The most interesting and rather surprising finding is, however, the very high number of rare damaging variants identified and, first of all, the cases (26/177) with further damaging variants in other genes in addition to those classified as causative. These additional variants may have a modifier effect. The percentage of these genetically complex patients may be higher, if we consider that many other important muscle genes (even if not disease-causing) can also carry damaging alleles. We predict that a broader NGS approach would strengthen this observation. We hypothesize that intrafamilial and interfamilial phenotypic differences may frequently be related to combinations of multiple disease-causing alleles, more than to SNPs or CNVs. The so-called "modifier gene variants" could be individually rare, but collectively common. A comprehensive view of all the genes involved in a pathological process helps to identify these alleles, which have a minor but probably not negligible role in disease aetiology.

Table 3 Predicted enrichment costs and workload for single and pooled DNA samples

| Technical step | Cost (€), single | Cost (€), PoolSeq |
|---|---|---|
| Haloplex Kit (96 samples) | 16240.83 | 4263.22 |
| Polymerase | 86 | 22.575 |
| AMPure XP beads | 400 | 105 |
| Validation and quantification of enriched target DNA | 386.8 | 101.5 |
| Total (total per sample) | 17113.63 (213.92) | 4492.29 (56.15) |

| Run time | Total time (h), single | Total time (h), PoolSeq |
|---|---|---|
| Enrichment procedure | 4 days | 1 day |
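
The totals reported in Table 3 can be checked by summing the individual cost items. In the sketch below the figures are those listed in the table (with decimal commas normalized); the divisor of 80 samples is an assumption of ours, inferred from the ratio between each total and its per-sample value, since it is not stated in this section.

```python
# Cost items (EUR) as listed in Table 3: kit, polymerase, beads, validation.
single = [16240.83, 86.0, 400.0, 386.8]
pooled = [4263.22, 22.575, 105.0, 101.5]

# Assumption: the per-sample figures in the table correspond to 80 samples
# (inferred from total / per-sample value; not stated in the text).
N_SAMPLES = 80

total_single = sum(single)
total_pooled = sum(pooled)
per_sample_single = total_single / N_SAMPLES
per_sample_pooled = total_pooled / N_SAMPLES

# Fraction of the enrichment cost saved by pooling.
saving = 1 - total_pooled / total_single

print(f"single: {total_single:.2f} total, {per_sample_single:.2f} per sample")
print(f"pooled: {total_pooled:.2f} total, {per_sample_pooled:.2f} per sample")
print(f"saving from pooling: {saving:.1%}")
```

Under this assumption the sums agree with the table to within rounding, and the pooled enrichment costs roughly a quarter of the single-sample one.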

The ultimate value of MotorPlex lies in its pooling performance. The specificity and sensitivity values are very high and quite similar to those obtained in singleton testing and, above all, the diagnostic rate is not affected. Pooling is particularly suited to large studies of complex and non-Mendelian disorders, in which a large number of samples must be analyzed to improve the statistical power [40]. Considering our finding of multiple damaging variants in disease genes, such large studies are just around the corner. In addition, MotorPlex may detect low-allelic-fraction variants in single samples, as in somatic mosaicism. The pooled MotorPlex is also the cheapest genetic test (Table 3) presented so far that is able to screen 93 complex conditions at the cost of a few PCR reactions.
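
To illustrate why pooled sequencing depends on deep and uniform coverage: in a pool of N diploid samples mixed in equal amounts, a heterozygous variant carried by a single individual is expected at an allele fraction of 1/(2N). The sketch below assumes equimolar pooling; the pool sizes and the read depth are hypothetical values chosen for illustration and are not taken from the study.

```python
def expected_allele_fraction(pool_size, carriers=1, alleles_per_carrier=1):
    """Expected variant allele fraction in an equimolar pool of diploid samples."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return carriers * alleles_per_carrier / (2 * pool_size)


def expected_variant_reads(pool_size, depth):
    """Expected number of reads supporting a single heterozygous variant."""
    return expected_allele_fraction(pool_size) * depth


# Hypothetical pool sizes and read depth, for illustration only.
for n in (1, 4, 8, 16):
    af = expected_allele_fraction(n)
    reads = expected_variant_reads(n, depth=1000)
    print(f"pool of {n}: allele fraction {af:.4f}, about {reads:.1f} variant reads at 1000x")
```

The same reasoning applies to a single sample with somatic mosaicism, where the variant is also present at an allele fraction well below 50%.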

In conclusion, we demonstrate here that MotorPlex can be used to accurately identify all DNA variants, even in huge muscle genes: the platform outperforms the WES approach in sensitivity and coverage. In addition, Pool-Seq may be the first option for cost-effective population studies aimed at understanding polygenic conditions. We think that similar protocols could be designed to extend NGS applications to other studies in human genetics, as well as in disease prevention, nutrition, forensics and many other fields.